US COVID Data

Data Prep

us <- read.csv("https://raw.githubusercontent.com/JaclynCoate/6373_Time_Series/master/TermProject/Data/USdaily.7.26.20.csv", header = T, strip.white = T)
us <- transform(us, date = as.Date(as.character(date), "%Y%m%d"))
us <- subset(us, select = -c(states, dateChecked, hospitalized, lastModified, total, posNeg, totalTestResultsIncrease, hash))
us[is.na(us)] <- 0
us = us[order(as.Date(us$date, format = "%Y%m%d")),]
head(us)
##           date positive negative pending hospitalizedCurrently
## 187 2020-01-22        2        0       0                     0
## 186 2020-01-23        2        0       0                     0
## 185 2020-01-24        2        0       0                     0
## 184 2020-01-25        2        0       0                     0
## 183 2020-01-26        2        0       0                     0
## 182 2020-01-27        2        0       0                     0
##     hospitalizedCumulative inIcuCurrently inIcuCumulative
## 187                      0              0               0
## 186                      0              0               0
## 185                      0              0               0
## 184                      0              0               0
## 183                      0              0               0
## 182                      0              0               0
##     onVentilatorCurrently onVentilatorCumulative recovered death
## 187                     0                      0         0     0
## 186                     0                      0         0     0
## 185                     0                      0         0     0
## 184                     0                      0         0     0
## 183                     0                      0         0     0
## 182                     0                      0         0     0
##     totalTestResults deathIncrease hospitalizedIncrease negativeIncrease
## 187                2             0                    0                0
## 186                2             0                    0                0
## 185                2             0                    0                0
## 184                2             0                    0                0
## 183                2             0                    0                0
## 182                2             0                    0                0
##     positiveIncrease
## 187                0
## 186                0
## 185                0
## 184                0
## 183                0
## 182                0

Current Hospitalization Realization

Traits:

  • Heavy wandering behavior
  • What appears to be some noise that could be pseudo-cyclic behavior hidden by the large numbers.
ggplot(data = us, aes(x=date, y=hospitalizedCurrently))+
  geom_line(color="orange")+
  labs(title = "Current COVID Hospitalized Cases US", y = "Thousands", x = "") +
    theme_fivethirtyeight()

Stationarity

It is difficult to assume stationarity for this data due to multiple factors. We are working under the assumption that COVID is a novel virus and cases as well as hospitalizations will eventually return to zero. This being said our current modeling techniques do things such as return to the median or mimic the previously seen trends. Also, we see a heavy wandering trend in both new cases and hospitalization would be dependent on this as well as time. We will review the data and see what, if any, non-stationary components reveal themselves and model the data accordingly.

Forecast Current Hospitalizations

It is difficult to assume stationarity for this data due to multiple factors. We are working under the assumption that COVID is a novel virus and cases as well as hospitalizations will eventually return to zero. This being said our current modeling techniques do things such as return to the median or mimic the previously seen trends. Also, we see a heavy wandering trend in both new cases and hospitalization would be dependent on this as well as time. We will review the data and see what, if any, non-stationary components reveal themselves and model the data accordingly.

Variable Review

When reviewing the variables and the scatter plot from our previous EDA we can see that there are some correlations that were expected, for example currentHospitalizations is correlated with the variables that refelct ICU and Ventilaor patients. These metrics wouldn’t exist without hospitalizations. These variables also cannot help up predict hospitalizations because they after occurances. So we will actually be leveraing variables such as positive and positive increase in order to see if there is some sort of correlation between hospitalized patients and the number of positive cases.

  • Positive Tests Trend
ggplot(data = us, aes(x=date, y=positive))+
  geom_line(color="orange")+
  labs(title = "Total Cases US", y = "Millions", x = "") +
    theme_fivethirtyeight()

  • Positive Increase Trend
ggplot(data = us, aes(x=date, y=positiveIncrease))+
  geom_line(color="orange")+
  labs(title = "Increase in COVID Cases US", y = "Thousands", x = "") +
    theme_fivethirtyeight()

MLR w/ Correlated Errors for Currently Hospitalized Patients

Forecast Independent Variables

  1. Forecast Increase Positive Cases
    1. Evaluation: Slowly dampening ACF and heavy wandering consistent with a (1-B) factor. As well as frequency peaks at 0 and .14. Consistent with (1-B) and seasonailty component of 7.
#a
plotts.sample.wge(us$positiveIncrease)

## $autplt
##  [1] 1.0000000 0.9740745 0.9439116 0.9109437 0.8869924 0.8711719 0.8605261
##  [8] 0.8462935 0.8144161 0.7772852 0.7356766 0.7045767 0.6829609 0.6646632
## [15] 0.6432819 0.6088202 0.5688872 0.5293417 0.5001353 0.4761589 0.4592724
## [22] 0.4426271 0.4148783 0.3789333 0.3458221 0.3182895
## 
## $freq
##  [1] 0.005347594 0.010695187 0.016042781 0.021390374 0.026737968
##  [6] 0.032085561 0.037433155 0.042780749 0.048128342 0.053475936
## [11] 0.058823529 0.064171123 0.069518717 0.074866310 0.080213904
## [16] 0.085561497 0.090909091 0.096256684 0.101604278 0.106951872
## [21] 0.112299465 0.117647059 0.122994652 0.128342246 0.133689840
## [26] 0.139037433 0.144385027 0.149732620 0.155080214 0.160427807
## [31] 0.165775401 0.171122995 0.176470588 0.181818182 0.187165775
## [36] 0.192513369 0.197860963 0.203208556 0.208556150 0.213903743
## [41] 0.219251337 0.224598930 0.229946524 0.235294118 0.240641711
## [46] 0.245989305 0.251336898 0.256684492 0.262032086 0.267379679
## [51] 0.272727273 0.278074866 0.283422460 0.288770053 0.294117647
## [56] 0.299465241 0.304812834 0.310160428 0.315508021 0.320855615
## [61] 0.326203209 0.331550802 0.336898396 0.342245989 0.347593583
## [66] 0.352941176 0.358288770 0.363636364 0.368983957 0.374331551
## [71] 0.379679144 0.385026738 0.390374332 0.395721925 0.401069519
## [76] 0.406417112 0.411764706 0.417112299 0.422459893 0.427807487
## [81] 0.433155080 0.438502674 0.443850267 0.449197861 0.454545455
## [86] 0.459893048 0.465240642 0.470588235 0.475935829 0.481283422
## [91] 0.486631016 0.491978610 0.497326203
## 
## $db
##  [1]  14.3997611  15.8955909   7.7642205   8.7050186   3.8198773
##  [6]  -1.1863576   2.6815084  -0.4667544  -1.0582274  -3.9729372
## [11]  -4.4388886  -5.3739230  -7.0574666  -3.9001118  -5.3732547
## [16]  -5.9203161  -6.0265835  -6.3073466  -8.7427682  -8.5315257
## [21]  -8.1694048  -8.8726978  -6.2228751  -5.6255282  -5.7869081
## [26]  -2.3190255  -0.6895940 -18.2510780 -12.7252753 -17.9895004
## [31] -13.1495229 -17.1809062 -12.9879824 -18.5885318 -14.0950716
## [36] -19.2823521 -16.1802599 -22.2193540 -17.1778910 -15.9864725
## [41] -13.8720977 -14.3606703 -12.3482825 -19.0159734 -12.8299009
## [46] -16.1634117 -19.2402721 -16.9940391 -19.9193164 -24.4462952
## [51] -12.2184745 -14.1347405 -11.1407485 -19.5608640 -22.3406178
## [56] -26.2948840 -15.9219942 -19.7896219 -16.3429699 -16.6545137
## [61] -13.2991307 -20.9465604 -12.5392646 -23.3015328 -21.5282804
## [66] -17.4748203 -26.6390880 -24.5923917 -17.5647166 -19.7282494
## [71] -17.6845420 -23.0825883 -19.0743556 -19.4958696 -19.5138333
## [76] -20.4270518 -18.7343253 -16.8982195 -14.1306252 -14.2054286
## [81] -15.0466299 -13.4474119 -14.7662082 -15.2848488 -14.7767269
## [86] -17.2543647 -17.5524024 -16.7680943 -23.6931393 -19.6888637
## [91] -22.9328710 -17.1805763 -15.7422487
## 
## $dbz
##  [1]  12.1744244  11.8068871  11.1912920  10.3234897   9.1987290
##  [6]   7.8134517   6.1690554   4.2794940   2.1855400  -0.0227779
## [11]  -2.1856123  -4.0930116  -5.5814407  -6.6274820  -7.3078903
## [16]  -7.6988921  -7.8526856  -7.8271468  -7.6919809  -7.5061868
## [21]  -7.3030888  -7.0977786  -6.9055053  -6.7545171  -6.6859448
## [26]  -6.7448940  -6.9710578  -7.3933729  -8.0283548  -8.8797888
## [31]  -9.9376266 -11.1745795 -12.5395304 -13.9486256 -15.2804168
## [36] -16.3913300 -17.1665125 -17.5812779 -17.7097183 -17.6695854
## [41] -17.5637030 -17.4562894 -17.3753921 -17.3239844 -17.2913960
## [46] -17.2628022 -17.2262162 -17.1767997 -17.1186074 -17.0640034
## [51] -17.0309217 -17.0382729 -17.1002705 -17.2209453 -17.3905128
## [56] -17.5857017 -17.7761429 -17.9367144 -18.0606665 -18.1649087
## [61] -18.2832201 -18.4521578 -18.6979308 -19.0286543 -19.4319317
## [66] -19.8765383 -20.3176913 -20.7053075 -20.9929698 -21.1442791
## [71] -21.1356798 -20.9585908 -20.6228760 -20.1586258 -19.6118560
## [76] -19.0349178 -18.4769322 -17.9782467 -17.5690679 -17.2703544
## [81] -17.0951532 -17.0493949 -17.1318377 -17.3332302 -17.6350503
## [86] -18.0085946 -18.4157858 -18.8133902 -19.1613062 -19.4324915
## [91] -19.6189508 -19.7292633 -19.7788462
b. Differencing Data: much more stationary data and have surfaced a seasonaly component = 7 seen on ACF peaks at 7, 14, 21
#b
inpos.diff = artrans.wge(us$positiveIncrease, phi.tr = 1, lag.max = 100)

c. Seaonality Transformation: stationary data, and an ACF that reflects data other than whitenoise to be modeled.

#c
inpos.diff.seas = artrans.wge(inpos.diff,phi.tr = c(0,0,0,0,0,0,1), lag.max = 100)

d. Model IDing: diagnose with aic.wge to determine phi, thetas: Both AIC/BIC select ARMA(4,2)

#d
aic5.wge(inpos.diff.seas)
## ---------WORKING... PLEASE WAIT... 
## 
## 
## Five Smallest Values of  aic
##       p    q        aic
## 15    4    2   15.62368
## 9     2    2   15.74701
## 18    5    2   15.74771
## 12    3    2   15.75351
## 7     2    0   15.75551
aic5.wge(inpos.diff.seas, type = "bic")
## ---------WORKING... PLEASE WAIT... 
## 
## 
## Five Smallest Values of  bic
##       p    q        bic
## 15    4    2   15.74832
## 2     0    1   15.79201
## 7     2    0   15.80893
## 4     1    0   15.81457
## 3     0    2   15.81522
e. White Noise Test: Reject the H_null and accept this data is not white noise
#e
acf(inpos.diff.seas)

ljung.wge(inpos.diff.seas)$pval
## Obs -0.3711871 -0.02127283 0.07854862 0.05194601 -0.1044838 0.3276797 -0.4658847 0.1565306 0.1284186 -0.1060791 -0.07693327 0.1030723 -0.06391648 0.05002847 0.004471288 -0.03640442 -0.01551996 0.04133725 -0.01110058 -0.08153895 0.06570572 -0.03709739 -0.02390062 0.04588417
## [1] 1.246447e-12
f. Estiamte Phis, Thetas
g. Model Building
#f
est.inpos.seas = est.arma.wge(inpos.diff.seas, p = 4, q = 2)
## 
## Coefficients of Original polynomial:  
## -1.8727 -1.4974 -0.4735 0.0283 
## 
## Factor                 Roots                Abs Recip    System Freq 
## 1+1.1293B+0.6983B^2   -0.8086+-0.8822i      0.8357       0.3681
## 1+0.7944B             -1.2588               0.7944       0.5000
## 1-0.0510B              19.6244               0.0510       0.0000
##   
## 
mean(us$positiveIncrease)
## [1] 22567.12
#g
h. Forecast 7/90 Days 
#7 day
inpos.preds7 = fore.aruma.wge(us$positiveIncrease, phi = est.inpos.seas$phi, theta = est.inpos.seas$theta, d=1, s=7, n.ahead = 7)

#90 day
inpos.preds90 = fore.aruma.wge(us$positiveIncrease, phi = est.inpos.seas$phi, theta = est.inpos.seas$theta, d=1, s=7, n.ahead = 90)

i. Plotting forecasts

#7 day forecast
plot(seq(1,187,1), us$positiveIncrease, type = "l", xlim = c(0,195), ylim = c(0,80000), ylab = "Increase in COVID Cases", main = "7 Day Increase in COVID Cases Forecast")
lines(seq(188, 194,1), inpos.preds7$f, type = "l", col = "red")

#90 day forecast
plot(seq(1,187,1), us$positiveIncrease, type = "l", xlim = c(0,280), ylim = c(0,80000), ylab = "Increase in COVID Cases", main = "90 Day Increase in COVID Cases Forecast")
lines(seq(188, 277,1), inpos.preds90$f, type = "l", col = "red")

Forecast Currently Hospitalized Patients using Positive Increase and Total Cases

  1. Data prep and recognizing differencing and seaonal components of Currently Hospitalized
#Selecting only those dates with reported current hospitilizations
us <- dplyr::slice(us,56:187)
invisible(us)
#Stationarize Currently Hospitalized Variable
us.diff = artrans.wge(us$hospitalizedCurrently, phi.tr = 1, lag.max = 100)

acf(us.diff)
us.diff.seas = artrans.wge(us.diff,phi.tr = c(0,0,0,0,0,0,1), lag.max = 100)

acf(us.diff.seas)
#Stationarize Increase Positive Variable
inpos.diff = artrans.wge(us$positiveIncrease, phi.tr = 1, lag.max = 100)

acf(inpos.diff)
inpos.diff.seas = artrans.wge(inpos.diff,phi.tr = c(0,0,0,0,0,0,1), lag.max = 100)

acf(inpos.diff.seas)

  1. Fit simple regression model predicting hospitalizedCurrently with positiveIncrease.
mlr.fit = lm(us.diff.seas~inpos.diff.seas, data=cbind(data.frame(us.diff.seas), data.frame(inpos.diff.seas)))
summary(mlr.fit)
## 
## Call:
## lm(formula = us.diff.seas ~ inpos.diff.seas, data = cbind(data.frame(us.diff.seas), 
##     data.frame(inpos.diff.seas)))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7149.3  -554.2    73.4   611.2  6506.7 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     -13.32614  143.23299  -0.093  0.92603   
## inpos.diff.seas   0.11733    0.04227   2.776  0.00637 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1595 on 122 degrees of freedom
## Multiple R-squared:  0.0594, Adjusted R-squared:  0.0517 
## F-statistic: 7.705 on 1 and 122 DF,  p-value: 0.006375
AIC(mlr.fit)
## [1] 2184.712
acf(mlr.fit$residuals)

plot(mlr.fit$residuals)

  1. Diagnose with aic.wge and see what the p, q would be with residuals from above model
    1. Below we can see that our aic.wge function has selected an ARMA(1,2) model for modeling our cmort information.
mlr.phis= aic.wge(mlr.fit$residuals)
mlr.phis
## $type
## [1] "aic"
## 
## $value
## [1] 14.53403
## 
## $p
## [1] 5
## 
## $q
## [1] 2
## 
## $phi
## [1] -0.2465069 -0.4219100  0.3119255  0.2116968  0.2024803
## 
## $theta
## [1] -0.4657537 -0.9566014
## 
## $vara
## [1] 1803071
  1. Now forcast with arima function with phi’s from above coefficients and ARMA(1,2) model,
mlr.fit.arima = arima(us$hospitalizedCurrently, order = c(5,1,2), seasonal = list(order = c(1,0,0), period = 7), xreg = us$positiveIncrease)
## Warning in log(s2): NaNs produced
AIC(mlr.fit.arima)
## [1] 2224.09
  1. Run test to see if data is white noise: confirmed white noise, continue forward
acf(mlr.fit.arima$resid) 

ltest1 = ljung.wge(mlr.fit.arima$resid) 
## Obs -0.006649913 -0.009334226 -0.01070918 0.01827832 -0.02771128 -0.01061308 -0.01123975 0.08781402 -0.05809279 0.07500195 0.05661161 -0.08527475 -0.007553036 0.1098068 0.002427647 -0.010962 -0.04242877 -0.001526285 -0.05612271 -0.03313627 0.02930685 -0.007307643 -0.06056842 -0.03342384
ltest1$pval
## [1] 0.999208
ltest2 = ljung.wge(mlr.fit.arima$resid, K= 48)
## Obs -0.006649913 -0.009334226 -0.01070918 0.01827832 -0.02771128 -0.01061308 -0.01123975 0.08781402 -0.05809279 0.07500195 0.05661161 -0.08527475 -0.007553036 0.1098068 0.002427647 -0.010962 -0.04242877 -0.001526285 -0.05612271 -0.03313627 0.02930685 -0.007307643 -0.06056842 -0.03342384 0.08407797 -0.006875273 -0.002312484 0.06777622 -0.09591675 -0.06198868 -0.01331578 -0.03441461 -0.04091228 0.0270524 0.06615092 0.06538511 -0.09261011 -0.04939741 -0.03187124 -0.04934637 -0.01834029 -0.02031817 -0.001873916 0.02337696 -0.004828854 0.006148926 -0.002455308 0.004810674
ltest2$pval
## [1] 0.9999866
  1. Load the forecasted increase positive in a data frame
    1. 7 Day
#7 Day Case Increase
regs7 = data.frame(positiveIncrease = inpos.preds7$f)
invisible(regs7)
b. 90 Day
#90 Day Case Increase
regs90 = data.frame(positiveIncrease = inpos.preds90$f)
invisible(regs90)
  1. Predictions
    1. 7 Day
mlr1.preds7 = predict(mlr.fit.arima, newxreg = regs7, n.ahead = 7)
invisible(mlr1.preds7)
b. 90 Day
mlr1.preds90 = predict(mlr.fit.arima, newxreg = regs90, n.ahead =90)
invisible(mlr1.preds90)
  1. Plotted Forecasts
    1. 7 Day
plot(seq(1,132,1), us$hospitalizedCurrently, type = "l",  xlim = c(0,140), ylim = c(0,60000), ylab = "Currently Hospitalized COVID Cases", main = "7 Day Forecast for COVID Hospitalized Cases")
lines(seq(133,139,1), mlr1.preds7$pred, type = "l", col = "red")

b. 90 Day

plot(seq(1,132,1), us$hospitalizedCurrently, type = "l",  xlim = c(0,223), ylim = c(0,60000), ylab = "Currently Hospitalized COVID Cases", main = "90 Day Forecast for COVID Hospitalized Cases")
lines(seq(133,222,1), mlr1.preds90$pred, type = "l", col = "red")

  1. Windowed ASE

MLR w/ Correlated Errors for Currently Hospitalized Patients w/ Trend

  1. Fit simple regression model predicting hospitalizedCurrently with positiveIncrease and trend
time <- seq(1,124,1)
mlr.fit.t = lm(us.diff.seas~inpos.diff.seas+time, data=cbind(data.frame(us.diff.seas), data.frame(inpos.diff.seas)))
summary(mlr.fit.t)
## 
## Call:
## lm(formula = us.diff.seas ~ inpos.diff.seas + time, data = cbind(data.frame(us.diff.seas), 
##     data.frame(inpos.diff.seas)))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7111.0  -524.5    73.9   617.7  6540.6 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)   
## (Intercept)      32.35112  289.27171   0.112  0.91114   
## inpos.diff.seas   0.11724    0.04244   2.762  0.00663 **
## time             -0.73096    4.01658  -0.182  0.85590   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1601 on 121 degrees of freedom
## Multiple R-squared:  0.05966,    Adjusted R-squared:  0.04412 
## F-statistic: 3.839 on 2 and 121 DF,  p-value: 0.02419
AIC(mlr.fit.t)
## [1] 2186.678
acf(mlr.fit.t$residuals)

plot(mlr.fit.t$residuals)

  1. Diagnose with aic.wge and see what the p, q would be with residuals from above model
    1. Below we can see that our aic.wge function has selected an ARMA(1,2) model for modeling our cmort information.
mlr.phis= aic.wge(mlr.fit.t$residuals)
mlr.phis
## $type
## [1] "aic"
## 
## $value
## [1] 14.53329
## 
## $p
## [1] 5
## 
## $q
## [1] 2
## 
## $phi
## [1] -0.2466126 -0.4220817  0.3118538  0.2119376  0.2027949
## 
## $theta
## [1] -0.4655288 -0.9564592
## 
## $vara
## [1] 1801724
  1. Now forcast with arima function with phi’s from above coefficients and ARMA(1,2) model,
Time <- seq(1,132,1)
mlr.fit.arima.t = arima(us$hospitalizedCurrently, order = c(5,1,2), seasonal = list(order = c(1,0,0), period = 7), xreg = cbind(us$positiveIncrease, Time))
AIC(mlr.fit.arima.t)
## [1] 2230.756
  1. Run test to see if data is white noise; confirmed white noise continue forward.
acf(mlr.fit.arima.t$resid) 

ltest1 = ljung.wge(mlr.fit.arima.t$resid) 
## Obs 0.002267493 0.004743329 0.004085712 0.008584111 0.003613306 0.03592544 -0.02552583 0.08439651 -0.1060931 0.01276058 0.02640347 -0.06061388 0.05268146 0.1458814 0.008052735 -0.04255608 -0.0935536 -0.04208266 -0.0498013 -0.001717023 0.05934226 0.01341049 -0.07137289 -0.07279666
ltest1$pval
## [1] 0.9823161
ltest2 = ljung.wge(mlr.fit.arima.t$resid, K= 48)
## Obs 0.002267493 0.004743329 0.004085712 0.008584111 0.003613306 0.03592544 -0.02552583 0.08439651 -0.1060931 0.01276058 0.02640347 -0.06061388 0.05268146 0.1458814 0.008052735 -0.04255608 -0.0935536 -0.04208266 -0.0498013 -0.001717023 0.05934226 0.01341049 -0.07137289 -0.07279666 0.05320601 -0.01313386 0.01794322 0.1027134 -0.07595104 -0.06199621 -0.03306343 -0.06206906 -0.03944973 0.03988941 0.07572305 0.07297186 -0.1029145 -0.06857141 -0.04403521 -0.04289784 0.005428483 -0.003871671 -7.833593e-05 0.005208028 -0.03085214 -0.01036108 0.00645461 0.02868683
ltest2$pval
## [1] 0.9990337
  1. Load the forecasted increase positive in a data frame
    1. 7 Day
#7 Day Case Increase
regs7t = data.frame(cbind(positiveIncrease = inpos.preds7$f, Time = seq(133,139)))
invisible(regs7)
b. 90 Day
#90 Day Case Increase
regs90t = data.frame(cbind(positiveIncrease = inpos.preds90$f, Time = seq(133,222)))
invisible(regs90)
  1. Predictions
    1. 7 Day
mlr1.preds7.t = predict(mlr.fit.arima.t, newxreg = regs7t, n.ahead = 7)
invisible(mlr1.preds7.t)
b. 90 Day
mlr1.preds90.t = predict(mlr.fit.arima.t, newxreg = regs90t, n.ahead =90)
invisible(mlr1.preds90.t)
  1. Plotted Forecasts
    1. 7 Day
plot(seq(1,132,1), us$hospitalizedCurrently, type = "l",  xlim = c(0,140), ylim = c(0,60000), ylab = "Currently Hospitalized COVID Cases", main = "7 Day Forecast for COVID Hospitalized Cases")
lines(seq(133,139,1), mlr1.preds7.t$pred, type = "l", col = "red")

b. 90 Day

plot(seq(1,132,1), us$hospitalizedCurrently, type = "l",  xlim = c(0,223), ylim = c(0,85000), ylab = "Currently Hospitalized COVID Cases", main = "90 Day Forecast for COVID Hospitalized Cases")
lines(seq(133,222,1), mlr1.preds90.t$pred, type = "l", col = "red")

MLR w/ Correlated Errors (lagged variables) for Currently Hospitalized Patients

With a quick check, we can see that there is no lag correlationed between the increase of COVID patients and hospitalized patients, like we theorized there might be. So we will not model an MLR w/ correlated errors on lagged variables.

ccf(us$positiveIncrease,us$hospitalizedCurrently)

VAR Model

Since we are working with a seasonal and transformation component but it requires an additional transformation for the total positive COVID cases to become stationary for evaluation, we will only use positive increase of cases to predict currently hospitzliaed cases for the VAR model.

  1. Differenced and Seasonal Transformation VAR Model
#Positive Cases Transformations
#pos.diff = artrans.wge(us$positive, phi.tr = 1, lag.max = 100)
#pos.diff.2 = artrans.wge(pos.diff, phi.tr = 1, lag.max = 100)
#pos.trans = artrans.wge(pos.diff.2,phi.tr = c(0,0,0,0,0,0,1), lag.max = 100)
#Increase Positive Transformation
inpos.diff = artrans.wge(us$positiveIncrease, phi.tr = 1, lag.max = 100)

inpos.trans = artrans.wge(inpos.diff,phi.tr = c(0,0,0,0,0,0,1), lag.max = 100)

#Current Hospitalization Transformations
us.diff = artrans.wge(us$hospitalizedCurrently, phi.tr = 1, lag.max = 100)

currhosp.trans = artrans.wge(us.diff,phi.tr = c(0,0,0,0,0,0,1), lag.max = 100)

  1. Use Var select to find best model and fit model: p = 8 for lowest AIC
#VARselect to choose lag
VARselect(cbind(currhosp.trans, inpos.trans), lag.max = 10, type= "both")
## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      7      7      2      7 
## 
## $criteria
##                   1            2            3            4            5
## AIC(n) 3.093857e+01 3.083329e+01 3.079091e+01 3.076976e+01 3.069554e+01
## HQ(n)  3.101650e+01 3.095018e+01 3.094676e+01 3.096458e+01 3.092932e+01
## SC(n)  3.113058e+01 3.112131e+01 3.117493e+01 3.124979e+01 3.127158e+01
## FPE(n) 2.731960e+13 2.459304e+13 2.357880e+13 2.309552e+13 2.145770e+13
##                   6            7            8            9           10
## AIC(n) 3.066656e+01 3.051427e+01 3.053424e+01 3.059674e+01 3.059649e+01
## HQ(n)  3.093930e+01 3.082598e+01 3.088492e+01 3.098638e+01 3.102509e+01
## SC(n)  3.133860e+01 3.128233e+01 3.139830e+01 3.155681e+01 3.165257e+01
## FPE(n) 2.086403e+13 1.793906e+13 1.833019e+13 1.955149e+13 1.959499e+13
#fit model
usdiff.var.fit = VAR(cbind(currhosp.trans, inpos.trans), type = "both", p = 2)
  1. Predictions for Difference
#7 day predictions
pred.var7 = predict(usdiff.var.fit, n.ahead = 7)
pred.var7
## $currhosp.trans
##            fcst     lower    upper       CI
## [1,]  -42.12350 -2942.410 2858.163 2900.286
## [2,] -385.17371 -3346.493 2576.145 2961.319
## [3,]  -80.65766 -3262.529 3101.214 3181.871
## [4,] -124.14161 -3327.324 3079.041 3203.182
## [5,]  -46.62794 -3284.538 3191.282 3237.910
## [6,]  -43.58380 -3288.207 3201.039 3244.623
## [7,]  -11.23201 -3262.277 3239.813 3251.045
## 
## $inpos.trans
##             fcst     lower    upper       CI
## [1,]  998.451219 -5160.669 7157.571 6159.120
## [2,]   75.758630 -6696.589 6848.106 6772.348
## [3,] -289.771376 -7075.922 6496.379 6786.150
## [4,]    2.972523 -6801.839 6807.784 6804.811
## [5,]   -7.811041 -6815.790 6800.167 6807.978
## [6,]  -72.355258 -6880.403 6735.692 6808.047
## [7,]  -29.105499 -6837.492 6779.281 6808.386
#90 day predictions
pred.var90 = predict(usdiff.var.fit, n.ahead = 90)
pred.var90
## $currhosp.trans
##              fcst     lower    upper       CI
##  [1,]  -42.123501 -2942.410 2858.163 2900.286
##  [2,] -385.173707 -3346.493 2576.145 2961.319
##  [3,]  -80.657665 -3262.529 3101.214 3181.871
##  [4,] -124.141611 -3327.324 3079.041 3203.182
##  [5,]  -46.627942 -3284.538 3191.282 3237.910
##  [6,]  -43.583795 -3288.207 3201.039 3244.623
##  [7,]  -11.232008 -3262.277 3239.813 3251.045
##  [8,]   -6.349200 -3259.226 3246.528 3252.877
##  [9,]    7.305513 -3246.870 3261.481 3254.175
## [10,]   12.421029 -3242.210 3267.053 3254.632
## [11,]   18.773716 -3236.131 3273.679 3254.905
## [12,]   22.480950 -3232.533 3277.495 3255.014
## [13,]   26.203504 -3228.870 3281.277 3255.073
## [14,]   28.923414 -3226.175 3284.022 3255.099
## [15,]   31.503265 -3223.608 3286.615 3255.112
## [16,]   33.699320 -3221.418 3288.817 3255.117
## [17,]   35.776726 -3219.344 3290.897 3255.120
## [18,]   37.697587 -3217.424 3292.819 3255.122
## [19,]   39.549962 -3215.572 3294.672 3255.122
## [20,]   41.334045 -3213.789 3296.457 3255.123
## [21,]   43.082251 -3212.041 3298.205 3255.123
## [22,]   44.799892 -3210.323 3299.923 3255.123
## [23,]   46.499486 -3208.623 3301.622 3255.123
## [24,]   48.185069 -3206.938 3303.308 3255.123
## [25,]   49.861824 -3205.261 3304.985 3255.123
## [26,]   51.532055 -3203.591 3306.655 3255.123
## [27,]   53.198022 -3201.925 3308.321 3255.123
## [28,]   54.860928 -3200.262 3309.984 3255.123
## [29,]   56.521787 -3198.601 3311.645 3255.123
## [30,]   58.181203 -3196.942 3313.304 3255.123
## [31,]   59.839642 -3195.283 3314.963 3255.123
## [32,]   61.497398 -3193.626 3316.620 3255.123
## [33,]   63.154688 -3191.968 3318.278 3255.123
## [34,]   64.811654 -3190.311 3319.935 3255.123
## [35,]   66.468399 -3188.655 3321.591 3255.123
## [36,]   68.124990 -3186.998 3323.248 3255.123
## [37,]   69.781476 -3185.341 3324.904 3255.123
## [38,]   71.437889 -3183.685 3326.561 3255.123
## [39,]   73.094252 -3182.029 3328.217 3255.123
## [40,]   74.750580 -3180.372 3329.874 3255.123
## [41,]   76.406884 -3178.716 3331.530 3255.123
## [42,]   78.063172 -3177.060 3333.186 3255.123
## [43,]   79.719449 -3175.404 3334.842 3255.123
## [44,]   81.375717 -3173.747 3336.499 3255.123
## [45,]   83.031981 -3172.091 3338.155 3255.123
## [46,]   84.688240 -3170.435 3339.811 3255.123
## [47,]   86.344498 -3168.778 3341.467 3255.123
## [48,]   88.000753 -3167.122 3343.124 3255.123
## [49,]   89.657007 -3165.466 3344.780 3255.123
## [50,]   91.313260 -3163.810 3346.436 3255.123
## [51,]   92.969513 -3162.153 3348.092 3255.123
## [52,]   94.625766 -3160.497 3349.749 3255.123
## [53,]   96.282018 -3158.841 3351.405 3255.123
## [54,]   97.938270 -3157.185 3353.061 3255.123
## [55,]   99.594521 -3155.528 3354.717 3255.123
## [56,]  101.250773 -3153.872 3356.374 3255.123
## [57,]  102.907025 -3152.216 3358.030 3255.123
## [58,]  104.563276 -3150.560 3359.686 3255.123
## [59,]  106.219528 -3148.903 3361.342 3255.123
## [60,]  107.875779 -3147.247 3362.999 3255.123
## [61,]  109.532031 -3145.591 3364.655 3255.123
## [62,]  111.188282 -3143.935 3366.311 3255.123
## [63,]  112.844534 -3142.278 3367.967 3255.123
## [64,]  114.500785 -3140.622 3369.624 3255.123
## [65,]  116.157037 -3138.966 3371.280 3255.123
## [66,]  117.813288 -3137.310 3372.936 3255.123
## [67,]  119.469540 -3135.653 3374.592 3255.123
## [68,]  121.125791 -3133.997 3376.249 3255.123
## [69,]  122.782043 -3132.341 3377.905 3255.123
## [70,]  124.438294 -3130.685 3379.561 3255.123
## [71,]  126.094546 -3129.028 3381.218 3255.123
## [72,]  127.750797 -3127.372 3382.874 3255.123
## [73,]  129.407049 -3125.716 3384.530 3255.123
## [74,]  131.063300 -3124.060 3386.186 3255.123
## [75,]  132.719552 -3122.403 3387.843 3255.123
## [76,]  134.375803 -3120.747 3389.499 3255.123
## [77,]  136.032055 -3119.091 3391.155 3255.123
## [78,]  137.688306 -3117.435 3392.811 3255.123
## [79,]  139.344558 -3115.778 3394.468 3255.123
## [80,]  141.000809 -3114.122 3396.124 3255.123
## [81,]  142.657061 -3112.466 3397.780 3255.123
## [82,]  144.313312 -3110.810 3399.436 3255.123
## [83,]  145.969564 -3109.153 3401.093 3255.123
## [84,]  147.625815 -3107.497 3402.749 3255.123
## [85,]  149.282067 -3105.841 3404.405 3255.123
## [86,]  150.938318 -3104.185 3406.061 3255.123
## [87,]  152.594570 -3102.528 3407.718 3255.123
## [88,]  154.250821 -3100.872 3409.374 3255.123
## [89,]  155.907073 -3099.216 3411.030 3255.123
## [90,]  157.563324 -3097.560 3412.686 3255.123
## 
## $inpos.trans
##               fcst     lower    upper       CI
##  [1,]  998.4512190 -5160.669 7157.571 6159.120
##  [2,]   75.7586304 -6696.589 6848.106 6772.348
##  [3,] -289.7713760 -7075.922 6496.379 6786.150
##  [4,]    2.9725232 -6801.839 6807.784 6804.811
##  [5,]   -7.8110412 -6815.790 6800.167 6807.978
##  [6,]  -72.3552576 -6880.403 6735.692 6808.047
##  [7,]  -29.1054988 -6837.492 6779.281 6808.386
##  [8,]  -34.9571452 -6843.380 6773.465 6808.422
##  [9,]  -35.4432540 -6843.879 6772.993 6808.436
## [10,]  -32.3830184 -6840.821 6776.055 6808.438
## [11,]  -30.3314474 -6838.771 6778.108 6808.439
## [12,]  -29.7776966 -6838.217 6778.662 6808.440
## [13,]  -28.0132116 -6836.453 6780.427 6808.440
## [14,]  -26.8374352 -6835.278 6781.603 6808.440
## [15,]  -25.5838580 -6834.024 6782.856 6808.440
## [16,]  -24.3686742 -6832.809 6784.072 6808.440
## [17,]  -23.1474191 -6831.588 6785.293 6808.440
## [18,]  -21.9649128 -6830.405 6786.475 6808.440
## [19,]  -20.7724497 -6829.213 6787.668 6808.440
## [20,]  -19.5931375 -6828.033 6788.847 6808.440
## [21,]  -18.4149177 -6826.855 6790.025 6808.440
## [22,]  -17.2403978 -6825.681 6791.200 6808.440
## [23,]  -16.0668991 -6824.507 6792.373 6808.440
## [24,]  -14.8951470 -6823.335 6793.545 6808.440
## [25,]  -13.7239465 -6822.164 6794.716 6808.440
## [26,]  -12.5534643 -6820.994 6795.887 6808.440
## [27,]  -11.3833229 -6819.824 6797.057 6808.440
## [28,]  -10.2134881 -6818.654 6798.227 6808.440
## [29,]   -9.0438264 -6817.484 6799.396 6808.440
## [30,]   -7.8743061 -6816.315 6800.566 6808.440
## [31,]   -6.7048710 -6815.145 6801.735 6808.440
## [32,]   -5.5355009 -6813.976 6802.905 6808.440
## [33,]   -4.3661725 -6812.806 6804.074 6808.440
## [34,]   -3.1968743 -6811.637 6805.243 6808.440
## [35,]   -2.0275963 -6810.468 6806.413 6808.440
## [36,]   -0.8583325 -6809.299 6807.582 6808.440
## [37,]    0.3109216 -6808.129 6808.751 6808.440
## [38,]    1.4801691 -6806.960 6809.920 6808.440
## [39,]    2.6494119 -6805.791 6811.090 6808.440
## [40,]    3.8186516 -6804.622 6812.259 6808.440
## [41,]    4.9878891 -6803.452 6813.428 6808.440
## [42,]    6.1571250 -6802.283 6814.597 6808.440
## [43,]    7.3263600 -6801.114 6815.767 6808.440
## [44,]    8.4955942 -6799.945 6816.936 6808.440
## [45,]    9.6648279 -6798.775 6818.105 6808.440
## [46,]   10.8340613 -6797.606 6819.274 6808.440
## [47,]   12.0032944 -6796.437 6820.444 6808.440
## [48,]   13.1725274 -6795.268 6821.613 6808.440
## [49,]   14.3417602 -6794.098 6822.782 6808.440
## [50,]   15.5109930 -6792.929 6823.951 6808.440
## [51,]   16.6802258 -6791.760 6825.120 6808.440
## [52,]   17.8494585 -6790.591 6826.290 6808.440
## [53,]   19.0186911 -6789.422 6827.459 6808.440
## [54,]   20.1879238 -6788.252 6828.628 6808.440
## [55,]   21.3571564 -6787.083 6829.797 6808.440
## [56,]   22.5263891 -6785.914 6830.967 6808.440
## [57,]   23.6956217 -6784.745 6832.136 6808.440
## [58,]   24.8648543 -6783.575 6833.305 6808.440
## [59,]   26.0340870 -6782.406 6834.474 6808.440
## [60,]   27.2033196 -6781.237 6835.644 6808.440
## [61,]   28.3725522 -6780.068 6836.813 6808.440
## [62,]   29.5417848 -6778.898 6837.982 6808.440
## [63,]   30.7110174 -6777.729 6839.151 6808.440
## [64,]   31.8802501 -6776.560 6840.320 6808.440
## [65,]   33.0494827 -6775.391 6841.490 6808.440
## [66,]   34.2187153 -6774.222 6842.659 6808.440
## [67,]   35.3879479 -6773.052 6843.828 6808.440
## [68,]   36.5571805 -6771.883 6844.997 6808.440
## [69,]   37.7264132 -6770.714 6846.167 6808.440
## [70,]   38.8956458 -6769.545 6847.336 6808.440
## [71,]   40.0648784 -6768.375 6848.505 6808.440
## [72,]   41.2341110 -6767.206 6849.674 6808.440
## [73,]   42.4033436 -6766.037 6850.844 6808.440
## [74,]   43.5725763 -6764.868 6852.013 6808.440
## [75,]   44.7418089 -6763.698 6853.182 6808.440
## [76,]   45.9110415 -6762.529 6854.351 6808.440
## [77,]   47.0802741 -6761.360 6855.521 6808.440
## [78,]   48.2495067 -6760.191 6856.690 6808.440
## [79,]   49.4187393 -6759.021 6857.859 6808.440
## [80,]   50.5879720 -6757.852 6859.028 6808.440
## [81,]   51.7572046 -6756.683 6860.197 6808.440
## [82,]   52.9264372 -6755.514 6861.367 6808.440
## [83,]   54.0956698 -6754.345 6862.536 6808.440
## [84,]   55.2649024 -6753.175 6863.705 6808.440
## [85,]   56.4341351 -6752.006 6864.874 6808.440
## [86,]   57.6033677 -6750.837 6866.044 6808.440
## [87,]   58.7726003 -6749.668 6867.213 6808.440
## [88,]   59.9418329 -6748.498 6868.382 6808.440
## [89,]   61.1110655 -6747.329 6869.551 6808.440
## [90,]   62.2802982 -6746.160 6870.721 6808.440
  1. Calculate Actual Forecasts from Predicted Differences
startingPoints7 = us$hospitalizedCurrently[126:132]
currHospForecasts7 = pred.var7$fcst$currhosp.trans[,1:3] + startingPoints7
startingPoints90 = us$hospitalizedCurrently[43:132]
currHospForecasts90 = pred.var90$fcst$currhosp.trans[,1:3] + startingPoints90
  1. Plotting Forecasts
#7 day Forecasts
plot(seq(1,132,1), us$hospitalizedCurrently, type = "l", xlim = c(0,139), ylab = "Currently Hospitalized COVID Patients", main = "7 Day Currently Hospitalized Patients Forecast")
lines(seq(133,139,1), as.data.frame(currHospForecasts7)$fcst, type = "l", col = "red")

#90 day Forecasts
plot(seq(1,132,1), us$hospitalizedCurrently, type = "l", xlim = c(0,222), ylab = "Currently Hospitalized COVID Patients", main = "90 Day Currently Hospitalized Patients Forecast")
lines(seq(133,222,1), as.data.frame(currHospForecasts90)$fcst, type = "l", col = "red")

  1. Calculate ASE # WHERE YOU LEFT OFF
#varASE = mean((us$hospitalizedCurrently[123:132]-pred.var$fcst$y1[1:10])^2)
#varASE
  1. Windowed ASE

Multilayer Perceptron Model (MLP) / NN

Since we have been looking at additional regressors for all our above models and know that the best regressor to leverage will be positive increase of COVID cases. This regressor reflects similar behavior to that of current hospitlizations and can be model properly against hospitalizations. 1. Train data set

currhosp.train = ts(us$hospitalizedCurrently[1:125])
  1. Create data frame of regressors
incposreg = data.frame(increase = ts(us$positiveIncrease))
incposreg
##     increase
## 1       3613
## 2       3171
## 3       4671
## 4       6255
## 5       6885
## 6       9259
## 7      11442
## 8      10632
## 9      12873
## 10     17656
## 11     19044
## 12     19724
## 13     19655
## 14     21897
## 15     24694
## 16     25773
## 17     27992
## 18     31945
## 19     33252
## 20     25550
## 21     28937
## 22     30737
## 23     30534
## 24     34419
## 25     34351
## 26     30561
## 27     27844
## 28     25133
## 29     25631
## 30     30298
## 31     30923
## 32     31964
## 33     27963
## 34     27468
## 35     25872
## 36     26333
## 37     28916
## 38     31784
## 39     34174
## 40     35901
## 41     27380
## 42     22605
## 43     25219
## 44     26641
## 45     29568
## 46     33056
## 47     29185
## 48     25767
## 49     22523
## 50     22449
## 51     25056
## 52     27490
## 53     27605
## 54     24810
## 55     21504
## 56     18236
## 57     22663
## 58     21193
## 59     26705
## 60     24710
## 61     24701
## 62     20171
## 63     20992
## 64     20853
## 65     21319
## 66     26513
## 67     24555
## 68     21590
## 69     20105
## 70     18798
## 71     16629
## 72     19385
## 73     22601
## 74     23522
## 75     23762
## 76     21639
## 77     20415
## 78     19982
## 79     20312
## 80     20839
## 81     23352
## 82     23106
## 83     18823
## 84     17012
## 85     17175
## 86     20803
## 87     22032
## 88     23477
## 89     25341
## 90     21373
## 91     18510
## 92     23435
## 93     23873
## 94     27527
## 95     31046
## 96     31994
## 97     27284
## 98     27017
## 99     33021
## 100    38684
## 101    39072
## 102    44421
## 103    43783
## 104    41857
## 105    39175
## 106    47462
## 107    50674
## 108    53826
## 109    54223
## 110    54734
## 111    45789
## 112    41600
## 113    51766
## 114    62147
## 115    58836
## 116    66645
## 117    63007
## 118    60978
## 119    58465
## 120    62879
## 121    65382
## 122    70953
## 123    77233
## 124    65180
## 125    64884
## 126    56971
## 127    63642
## 128    69150
## 129    71027
## 130    75193
## 131    65413
## 132    61713
  1. Fit a multilayer perceptron model (nn) w/ Increase Positive Cases Regressor
currhosp.default.fit = mlp(currhosp.train, xreg = incposreg)
plot(currhosp.default.fit)

  1. Forecast w/ test data set
currhosp.reg.fore = forecast(currhosp.default.fit, h = 7, xreg = incposreg)
plot(currhosp.reg.fore)

  1. Plot Observed v Forecasted
plot(us$hospitalizedCurrently[126:132], type = "l", ylim = c(58000,63000))
lines(seq(1:7), currhosp.reg.fore$mean, col = "blue")

  1. ASEs
ASE.default.mlp = mean((us$hospitalizedCurrently[126:132]-currhosp.reg.fore$mean)^2)
ASE.default.mlp
## [1] 1571071
  1. Default Forecasts
  1. 7 Day Forecast Model # Error
#currhosp.mlp = mlp(ts(us$hospitalizedCurrently), comb = "mean", xreg = incposreg)
#currhosp.mlp
#plot(currhosp.mlp)

#currhosp.forecast7 = forecast(currhosp.mlp, h=7, xreg = incposreg)
#plot(currhosp.forecast7)

#fit.mlp = mlp(ts(SM$Melanoma),reps = 50,comb = "mean",xreg = SM)
#fit.mlp
#plot(fit.mlp)
#SMDF = data.frame(Sunspot = ts(c(SM$Sunspot,SMsmallDF_fore$Sunspot)))
#fore.mlp = forecast(fit.mlp, h = 8, xreg = SMDF)
#plot(fore.mlp)
  1. 90 Day Forecast Model # Error
#currhosp.forecast90 = forecast(currhosp.fit.default, h = 90, xreg = incposreg)
#plot(currhosp.forecast90)

Ensemble Model